Transformation-based and Memory-based Learning for Detecting Speech Recognition Errors
Abstract
This paper presents initial experiments with transformation-based and memory-based learning for detecting word-level errors in speech recognition results. The features tested include word confidence scores along with lexical, contextual and pragmatic information. With the richest feature set, the best classifier performs 11.9% better than the baseline for all words and 17.9% better for content words. The two learners performed approximately equally. However, the classifiers often disagreed, which indicates that an ensemble method may be useful.
Similar resources
Early error detection on word level
In this paper two studies are presented in which the detection of speech recognition errors on the word level was examined. In the first study, memory-based and transformation-based machine learning was used for the task, using confidence, lexical, contextual and discourse features. In the second study, we investigated which factors humans benefit from when detecting errors. Information from th...
Error Detection Using Linguistic Features
Recognition errors hinder the proliferation of speech recognition (SR) systems. Based on the observation that recognition errors may result in ungrammatical sentences, especially in dictation applications where an acceptable level of accuracy of generated documents is indispensable, we propose to incorporate two kinds of linguistic features into error detection: lexical features of words, and sy...
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Transformation-based error correction for speech-to-text systems
We present a universal approach to uncover and correct systematic local errors in complex speech-to-text systems. Whereas previous work to minimize speech recognition errors mostly relies on N-best lists or word lattices, our approach is merely based on the first-best system output. The paradigm of Transformation-Based Learning (TBL) is adapted from tagging-like applications to the more complica...